CSCI417/ECEN425: Machine Intelligence-FALL22

Team Members:

Shrouk Hesham 19106271

Hussin Fekry 19105777

Omnia Salah 19106208

Abdelrahman Mahmoud 19104609

Under Supervision:

Dr. Ghada Khoriba & Eng. Aly Abdelmageed

"Pneumonia Disease Detection using Machine Intelligence Techniques to Diagnose Acute Respiratory Failure"


Introduction

1.1 What is Pneumonia ?

Pneumonia is an inflammatory condition of the lung that primarily affects the small air sacs known as alveoli. Symptoms typically include some combination of a productive or dry cough, chest pain, fever, and difficulty breathing, and the severity of the condition varies. Pneumonia is usually caused by infection with viruses or bacteria, and it is not a single disease but a family of more than 200 different lung conditions. According to the latest World Health Organization (WHO) data published in 2020, lung disease deaths in Egypt reached 13,393, or 2.50% of total deaths, an age-adjusted death rate of 20.69 per 100,000 that ranks Egypt #100 worldwide. Our main goal is earlier detection of pneumonia: given a chest X-ray of a patient's lungs, we apply image processing and machine intelligence techniques to predict whether the patient shows signs of pneumonia, helping impacted patients get diagnosed sooner.

1.2 What is CNN ?

CNN stands for Convolutional Neural Network, a neural network specialized for processing data with a grid-like input shape, such as the 2D pixel matrix of an image. CNNs are typically used for image detection and classification.
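As a minimal, illustrative sketch of such a network (not the model used later in this notebook), the Keras code below stacks one convolution and one pooling layer in front of a small classifier head; the input size and layer widths are arbitrary and chosen only to show the pattern.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# A toy CNN: convolution -> pooling -> flatten -> dense classifier
toy_cnn = models.Sequential([
    layers.Input(shape=(64, 64, 3)),        # small RGB input for illustration
    layers.Conv2D(8, (3, 3), activation='relu', padding='same'),
    layers.MaxPooling2D((2, 2)),            # halves the spatial resolution
    layers.Flatten(),
    layers.Dense(2, activation='softmax'),  # two classes: normal / pneumonia
])
print(toy_cnn.output_shape)  # (None, 2)
```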

1.3 What is Transfer Learning ?

Transfer learning is a machine learning technique where a model trained on one task is re-purposed on a second, related task.
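A minimal sketch of the idea, mirroring the models built later in this notebook: freeze a pretrained backbone and train only a small new head. Note that `weights=None` is used here purely so the snippet runs without downloading anything; in practice, and in the model-building cells below, `weights='imagenet'` is what makes it transfer learning.

```python
import tensorflow as tf
from tensorflow.keras.applications.mobilenet_v2 import MobileNetV2
from tensorflow.keras import layers, models

# In real use: weights='imagenet' (downloads the pretrained weights)
base = MobileNetV2(weights=None, include_top=False, input_shape=(96, 96, 3))
base.trainable = False  # freeze the backbone: its weights stay fixed

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(2, activation='softmax'),  # new task-specific head
])
# Only the head's 1280*2 + 2 parameters are trainable
```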

Summary

1 - Import packages & Load Libraries

2 - Data exploration

2.1 - Directory Setup for Train-validation-test

2.2 - Visualization

3 - Data Augmentation

4 - Data Preprocessing

4.1 - Helper Functions

5 - Model Building

5.1 - VGG16

5.2 - MobileNetV2

5.3 - DenseNet169

5.4 - InceptionV3

6 - Final conclusion


1 - Import packages & Load Libraries

In [1]:
!pip install opencv-python
Requirement already satisfied: opencv-python in d:\anaconda\lib\site-packages (4.7.0.68)
Requirement already satisfied: numpy>=1.17.0 in d:\anaconda\lib\site-packages (from opencv-python) (1.21.5)
In [2]:
# Making all necessary imports
%matplotlib inline
%config InlineBackend.figure_format = 'retina'
import os
import glob
import json
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf
import cv2
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D,GlobalAveragePooling2D, Flatten, Dense, Dropout, BatchNormalization
from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping, ReduceLROnPlateau
from tensorflow.keras.optimizers import Adam

import warnings
warnings.filterwarnings('ignore')

import logging
logger = tf.get_logger()
logger.setLevel(logging.ERROR)
print(tf.__version__)
2.4.1
In [3]:
def get_files(base_dir, target_dir):
    # Walk the directory tree and count every file under it
    count = 0
    path = get_path(base_dir, target_dir)
    for dirname, _, filenames in os.walk(path):
        count += len(filenames)
    return path, count

def get_path(base_dir, target_dir):
    return os.path.join(base_dir, target_dir)
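A quick self-contained demo of these helpers on a throwaway directory (the helpers are repeated here so the snippet runs on its own; the file names are made up):

```python
import os
import tempfile

def get_path(base_dir, target_dir):
    return os.path.join(base_dir, target_dir)

def get_files(base_dir, target_dir):
    # As above: walk the tree and count every file
    count = 0
    path = get_path(base_dir, target_dir)
    for dirname, _, filenames in os.walk(path):
        count += len(filenames)
    return path, count

root = tempfile.mkdtemp()
target = os.path.join(root, 'train', 'normal')
os.makedirs(target)
for name in ('a.jpeg', 'b.jpeg', 'c.jpeg'):
    open(os.path.join(target, name), 'w').close()

path, count = get_files(root, os.path.join('train', 'normal'))
print(count)  # 3
```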

2 - Data exploration

Dataset analysis is a necessary step before creating models.

3 folders are provided to us:

Dataset/
    test/
        opacity/
        normal/
    train/
        opacity/
        normal/
    val/
        opacity/
        normal/

2.1 - Directory Setup for Train-validation-test

In [4]:
base_dir = 'D:/Machine Proj/Project/Dataset'

train_normal_dir = 'D:/Machine Proj/Project/Dataset/train/normal'
train_pneumonia_dir = 'D:/Machine Proj/Project/Dataset/train/opacity'

val_normal_dir = 'D:/Machine Proj/Project/Dataset/val/normal'
val_pneumonia_dir = 'D:/Machine Proj/Project/Dataset/val/opacity'

test_normal_dir = 'D:/Machine Proj/Project/Dataset/test/normal'
test_pneumonia_dir = 'D:/Machine Proj/Project/Dataset/test/opacity'


train_normal_path, train_normal_count = get_files(base_dir,train_normal_dir)
train_pneumonia_path, train_pneumonia_count = get_files(base_dir,train_pneumonia_dir)

val_normal_path, val_normal_count = get_files(base_dir,val_normal_dir)
val_pneumonia_path, val_pneumonia_count = get_files(base_dir,val_pneumonia_dir)

test_normal_path, test_normal_count = get_files(base_dir,test_normal_dir)
test_pneumonia_path, test_pneumonia_count = get_files(base_dir,test_pneumonia_dir)

print("No of Train Images: {}".format(train_normal_count + train_pneumonia_count))
print(" \u2022 No of Normal Images {}".format(train_normal_count))
print(" \u2022 No of Pneumonia Images {}".format(train_pneumonia_count))

print("No of Validation Images: {}".format(val_normal_count + val_pneumonia_count))
print(" \u2022 No of Normal Images {}".format(val_normal_count))
print(" \u2022 No of Pneumonia Images {}".format(val_pneumonia_count))

print("No of Test Images: {}".format(test_normal_count + test_pneumonia_count))
print(" \u2022 No of Normal Images {}".format(test_normal_count))
print(" \u2022 No of Pneumonia Images {}".format(test_pneumonia_count))
No of Train Images: 4192
 • No of Normal Images 1082
 • No of Pneumonia Images 3110
No of Validation Images: 1040
 • No of Normal Images 267
 • No of Pneumonia Images 773
No of Test Images: 624
 • No of Normal Images 234
 • No of Pneumonia Images 390
In [5]:
train_data = []
for filename in os.listdir(train_normal_path):
    train_data.append((os.path.join(train_normal_path,filename),0))

for filename in os.listdir(train_pneumonia_path):
    train_data.append((os.path.join(train_pneumonia_path,filename),1))

train_data = pd.DataFrame(train_data, columns=['image_path', 'label'], index=None)
train_data = train_data.sample(frac=1).reset_index(drop=True)
        
val_data = []
for filename in os.listdir(val_normal_path):
    val_data.append((os.path.join(val_normal_path,filename),0))

for filename in os.listdir(val_pneumonia_path):
    val_data.append((os.path.join(val_pneumonia_path,filename),1))
        
val_data = pd.DataFrame(val_data, columns=['image_path', 'label'], index=None)
        
test_data = []
for filename in os.listdir(test_normal_path):
    test_data.append((os.path.join(test_normal_path,filename),0))

for filename in os.listdir(test_pneumonia_path):
    test_data.append((os.path.join(test_pneumonia_path,filename),1))

test_data = pd.DataFrame(test_data, columns=['image_path', 'label'], index=None)

print("Train Data {}".format(train_data.shape))
print("Validation Data {}".format(val_data.shape))
print("Test Data {}".format(test_data.shape))
Train Data (4192, 2)
Validation Data (1040, 2)
Test Data (624, 2)
In [6]:
train_data
Out[6]:
image_path label
0 ../input/pneumonia-xray-images/train/opacity/p... 1
1 ../input/pneumonia-xray-images/train/opacity/p... 1
2 ../input/pneumonia-xray-images/train/opacity/p... 1
3 ../input/pneumonia-xray-images/train/normal/IM... 0
4 ../input/pneumonia-xray-images/train/opacity/p... 1
... ... ...
4187 ../input/pneumonia-xray-images/train/opacity/p... 1
4188 ../input/pneumonia-xray-images/train/opacity/p... 1
4189 ../input/pneumonia-xray-images/train/opacity/p... 1
4190 ../input/pneumonia-xray-images/train/normal/IM... 0
4191 ../input/pneumonia-xray-images/train/opacity/p... 1

4192 rows × 2 columns

In [7]:
class_dict = {0:'Normal', 1:'Pneumonia'}
train_data['class_name'] = train_data.label.map(class_dict)
train_data['class_name'].value_counts().plot(kind='bar')
Out[7]:
<AxesSubplot:>
In [9]:
for filepath in train_data.image_path:
    image = cv2.imread(filepath)
    image_size = image.shape
    break
image_size
Out[9]:
(1006, 1404, 3)

2.2 - Visualization

In [10]:
def visualize_img(images):
    fig = plt.figure(figsize=(20, 15))
    for i,path in enumerate(images):
        fig.add_subplot(4, 4, i+1, xticks=[], yticks=[])
        img = cv2.imread(path)
        plt.imshow(img)
        plt.title(train_data[train_data.image_path == path].class_name.values[0])
        
for i in range(2):
    images = train_data[train_data.label == i].image_path
    images = np.random.choice(images , 8)
    visualize_img(images)
In [11]:
def plotImages(images_arr):
    fig, axes = plt.subplots(1, 5, figsize=(20,20))
    axes = axes.flatten()
    for img, ax in zip(images_arr, axes):
        ax.imshow(img)
    plt.tight_layout()
    plt.show()

3 - Data Augmentation

In [12]:
BATCH_SIZE = 32
IMG_SHAPE  = 224

train_image_gen = ImageDataGenerator(rescale=1./255,
                                     width_shift_range=0.1,
                                     height_shift_range=0.1,
                                     brightness_range=[0.2,1.0],
                                     zoom_range=0.2,
                                     horizontal_flip=True,
                                     fill_mode='nearest')

train_gen = train_image_gen.flow_from_dataframe(train_data,
                                              x_col='image_path',
                                              y_col='class_name',
                                              class_mode='binary',
                                              batch_size=BATCH_SIZE,
                                              shuffle=True,
                                              target_size=(IMG_SHAPE,IMG_SHAPE))
Found 4192 validated image filenames belonging to 2 classes.
In [23]:
augmented_images = [train_gen[0][0][2] for i in range(5)]
plotImages(augmented_images)

4 - Data Preprocessing

In [24]:
from tensorflow.keras.utils import to_categorical

train_lb = to_categorical(train_data.label, dtype = int)
val_lb = to_categorical(val_data.label, dtype=int)

train_data = train_data.reset_index().drop(labels='index', axis=1)
y_train = pd.DataFrame(train_lb).add_prefix('label_')

val_data = val_data.reset_index().drop(labels='index', axis=1)
y_val = pd.DataFrame(val_lb).add_prefix('label_')

train_data = pd.concat([train_data, y_train], axis=1)
val_data = pd.concat([val_data, y_val], axis=1)

print("Training set has {} samples".format(train_data.shape[0]))
print("Validation set has {} samples".format(val_data.shape[0]))
Training set has 4192 samples
Validation set has 1040 samples
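`to_categorical` simply one-hot encodes the integer labels into the two `label_0`/`label_1` columns used below. A NumPy sketch of the same transformation:

```python
import numpy as np

def one_hot(labels, num_classes):
    # Equivalent to tf.keras.utils.to_categorical for integer labels
    out = np.zeros((len(labels), num_classes), dtype=int)
    out[np.arange(len(labels)), labels] = 1
    return out

print(one_hot([0, 1, 1], 2))
# [[1 0]
#  [0 1]
#  [0 1]]
```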

4.1 - Helper Functions

In [25]:
BATCH_SIZE = 32
IMG_SHAPE  = 224
EPOCHS = 20

def gen():
    train_image_gen = ImageDataGenerator(rescale=1./255,
                                         width_shift_range=0.1,
                                         height_shift_range=0.1,
                                         brightness_range=[0.2,1.0],
                                         zoom_range=0.2,
                                         horizontal_flip=True,
                                         vertical_flip=True,
                                         fill_mode='nearest')

    train_gen = train_image_gen.flow_from_dataframe(train_data,
                                              x_col='image_path',
                                              y_col=[f'label_{x}' for x in np.arange(2)],
                                              class_mode='raw',
                                              batch_size=BATCH_SIZE,
                                              shuffle=True,
                                              target_size=(IMG_SHAPE,IMG_SHAPE))


    val_image_gen = ImageDataGenerator(rescale=1./255)

    val_gen = val_image_gen.flow_from_dataframe(val_data,
                                              x_col='image_path',
                                              y_col= [f'label_{x}' for x in np.arange(2)],
                                              class_mode='raw',
                                              batch_size=BATCH_SIZE,
                                              target_size=(IMG_SHAPE,IMG_SHAPE))
    return train_gen, val_gen
In [26]:
def plot(history):

    training_accuracy = history.history['accuracy']
    validation_accuracy = history.history['val_accuracy']

    training_loss = history.history['loss']
    validation_loss = history.history['val_loss']

    epochs_range=range(len(training_accuracy))

    plt.figure(figsize=(8, 8))
    plt.subplot(1, 2, 1)
    plt.plot(epochs_range, training_accuracy, label='Training Accuracy')
    plt.plot(epochs_range, validation_accuracy, label='Validation Accuracy')
    plt.legend(loc='lower right')
    plt.title('Training and Validation Accuracy')

    plt.subplot(1, 2, 2)
    plt.plot(epochs_range, training_loss, label='Training Loss')
    plt.plot(epochs_range, validation_loss, label='Validation Loss')
    plt.legend(loc='upper right')
    plt.title('Training and Validation Loss')
    plt.show()
In [27]:
def predict(image_path, model):
    im = cv2.imread(image_path)
    test_image = np.asarray(im)
    processed_test_image = process_image(test_image)
    processed_test_image = np.expand_dims(processed_test_image, axis = 0)
    
    ps = model.predict(processed_test_image)
    return ps
    
def process_image(image):
    image = tf.cast(image , tf.float32)
    image = tf.image.resize(image , (224 , 224))
    image = image/255
    image = image.numpy()
    return image

5 - Model Building

5.1 - VGG16

In [28]:
from tensorflow.keras.applications.vgg16 import VGG16

base = VGG16(weights = 'imagenet', include_top = False, input_shape = (224, 224, 3))
tf.keras.backend.clear_session()

for layer in base.layers:
    layer.trainable = False
    
vgg_model = Sequential()
vgg_model.add(base)
vgg_model.add(GlobalAveragePooling2D())
vgg_model.add(BatchNormalization())
vgg_model.add(Dense(256, activation='relu'))
vgg_model.add(Dropout(0.5))
vgg_model.add(BatchNormalization())
vgg_model.add(Dense(128, activation='relu'))
vgg_model.add(Dropout(0.5))
vgg_model.add(Dense(2, activation='softmax'))

vgg_model.summary()
Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/vgg16/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5
58892288/58889256 [==============================] - 0s 0us/step
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
vgg16 (Functional)           (None, 7, 7, 512)         14714688  
_________________________________________________________________
global_average_pooling2d (Gl (None, 512)               0         
_________________________________________________________________
batch_normalization (BatchNo (None, 512)               2048      
_________________________________________________________________
dense (Dense)                (None, 256)               131328    
_________________________________________________________________
dropout (Dropout)            (None, 256)               0         
_________________________________________________________________
batch_normalization_1 (Batch (None, 256)               1024      
_________________________________________________________________
dense_1 (Dense)              (None, 128)               32896     
_________________________________________________________________
dropout_1 (Dropout)          (None, 128)               0         
_________________________________________________________________
dense_2 (Dense)              (None, 2)                 258       
=================================================================
Total params: 14,882,242
Trainable params: 166,018
Non-trainable params: 14,716,224
_________________________________________________________________
In [34]:
train_gen, val_gen = gen()

optm = Adam(learning_rate=0.0001)
vgg_model.compile(loss='binary_crossentropy', optimizer=optm,
                  metrics=['accuracy'])

# Weight the loss to compensate for the under-represented 'normal' class
from sklearn.utils.class_weight import compute_class_weight
weights = compute_class_weight('balanced',
                               classes=np.unique(train_data.label),
                               y=train_data.label)
class_weights = dict(enumerate(weights))

# Use a distinct name so the EarlyStopping class is not shadowed by its instance
early_stopping = EarlyStopping(monitor='val_loss',
                               min_delta=.0001,
                               patience=3,
                               verbose=1,
                               mode='auto',
                               restore_best_weights=True)

model_save = ModelCheckpoint('./vgg16_model.h5',
                             save_best_only=True,
                             save_weights_only=False,
                             monitor='val_loss',
                             mode='min', verbose=1)


vgg_history = vgg_model.fit(train_gen,
                            steps_per_epoch=train_gen.samples // BATCH_SIZE,
                            epochs=20,
                            validation_data=val_gen,
                            callbacks=[early_stopping, model_save],
                            class_weight=class_weights)
Found 4192 validated image filenames.
Found 1040 validated image filenames.
Epoch 1/20
131/131 [==============================] - 154s 1s/step - loss: 0.8175 - accuracy: 0.6022 - val_loss: 0.6654 - val_accuracy: 0.6798

Epoch 00001: val_loss improved from inf to 0.66541, saving model to ./vgg16_model.h5
Epoch 2/20
131/131 [==============================] - 96s 737ms/step - loss: 0.6610 - accuracy: 0.6953 - val_loss: 0.6565 - val_accuracy: 0.5587

Epoch 00002: val_loss improved from 0.66541 to 0.65652, saving model to ./vgg16_model.h5
Epoch 3/20
131/131 [==============================] - 97s 744ms/step - loss: 0.5702 - accuracy: 0.7364 - val_loss: 0.6720 - val_accuracy: 0.5452

Epoch 00003: val_loss did not improve from 0.65652
Epoch 4/20
131/131 [==============================] - 101s 770ms/step - loss: 0.5277 - accuracy: 0.7616 - val_loss: 0.5857 - val_accuracy: 0.6712

Epoch 00004: val_loss improved from 0.65652 to 0.58567, saving model to ./vgg16_model.h5
Epoch 5/20
131/131 [==============================] - 101s 773ms/step - loss: 0.4665 - accuracy: 0.8013 - val_loss: 0.5460 - val_accuracy: 0.7298

Epoch 00005: val_loss improved from 0.58567 to 0.54595, saving model to ./vgg16_model.h5
Epoch 6/20
131/131 [==============================] - 98s 750ms/step - loss: 0.4442 - accuracy: 0.8037 - val_loss: 0.4674 - val_accuracy: 0.7827

Epoch 00006: val_loss improved from 0.54595 to 0.46740, saving model to ./vgg16_model.h5
Epoch 7/20
131/131 [==============================] - 97s 743ms/step - loss: 0.4312 - accuracy: 0.8251 - val_loss: 0.4376 - val_accuracy: 0.8087

Epoch 00007: val_loss improved from 0.46740 to 0.43764, saving model to ./vgg16_model.h5
Epoch 8/20
131/131 [==============================] - 97s 740ms/step - loss: 0.3915 - accuracy: 0.8244 - val_loss: 0.3768 - val_accuracy: 0.8298

Epoch 00008: val_loss improved from 0.43764 to 0.37685, saving model to ./vgg16_model.h5
Epoch 9/20
131/131 [==============================] - 99s 753ms/step - loss: 0.3973 - accuracy: 0.8382 - val_loss: 0.4190 - val_accuracy: 0.8096

Epoch 00009: val_loss did not improve from 0.37685
Epoch 10/20
131/131 [==============================] - 101s 775ms/step - loss: 0.3613 - accuracy: 0.8405 - val_loss: 0.3667 - val_accuracy: 0.8452

Epoch 00010: val_loss improved from 0.37685 to 0.36667, saving model to ./vgg16_model.h5
Epoch 11/20
131/131 [==============================] - 98s 747ms/step - loss: 0.3427 - accuracy: 0.8565 - val_loss: 0.3431 - val_accuracy: 0.8548

Epoch 00011: val_loss improved from 0.36667 to 0.34311, saving model to ./vgg16_model.h5
Epoch 12/20
131/131 [==============================] - 98s 750ms/step - loss: 0.3791 - accuracy: 0.8448 - val_loss: 0.3543 - val_accuracy: 0.8471

Epoch 00012: val_loss did not improve from 0.34311
Epoch 13/20
131/131 [==============================] - 96s 737ms/step - loss: 0.3611 - accuracy: 0.8546 - val_loss: 0.3471 - val_accuracy: 0.8510

Epoch 00013: val_loss did not improve from 0.34311
Epoch 14/20
131/131 [==============================] - 97s 744ms/step - loss: 0.3182 - accuracy: 0.8738 - val_loss: 0.3377 - val_accuracy: 0.8529

Epoch 00014: val_loss improved from 0.34311 to 0.33768, saving model to ./vgg16_model.h5
Epoch 15/20
131/131 [==============================] - 96s 733ms/step - loss: 0.3369 - accuracy: 0.8561 - val_loss: 0.3083 - val_accuracy: 0.8596

Epoch 00015: val_loss improved from 0.33768 to 0.30827, saving model to ./vgg16_model.h5
Epoch 16/20
131/131 [==============================] - 97s 738ms/step - loss: 0.3379 - accuracy: 0.8638 - val_loss: 0.3041 - val_accuracy: 0.8673

Epoch 00016: val_loss improved from 0.30827 to 0.30413, saving model to ./vgg16_model.h5
Epoch 17/20
131/131 [==============================] - 96s 733ms/step - loss: 0.3522 - accuracy: 0.8608 - val_loss: 0.3468 - val_accuracy: 0.8500

Epoch 00017: val_loss did not improve from 0.30413
Epoch 18/20
131/131 [==============================] - 96s 734ms/step - loss: 0.3254 - accuracy: 0.8581 - val_loss: 0.2951 - val_accuracy: 0.8683

Epoch 00018: val_loss improved from 0.30413 to 0.29512, saving model to ./vgg16_model.h5
Epoch 19/20
131/131 [==============================] - 97s 736ms/step - loss: 0.3668 - accuracy: 0.8482 - val_loss: 0.3941 - val_accuracy: 0.8288

Epoch 00019: val_loss did not improve from 0.29512
Epoch 20/20
131/131 [==============================] - 96s 732ms/step - loss: 0.3221 - accuracy: 0.8632 - val_loss: 0.2957 - val_accuracy: 0.8712

Epoch 00020: val_loss did not improve from 0.29512
In [35]:
plot(vgg_history)
In [37]:
from sklearn.metrics import classification_report, confusion_matrix
import seaborn as sns

vgg_pred = []
for image in test_data.image_path:
    vgg_pred.append(predict(image, vgg_model))
    
final_vgg_pred  = np.argmax(vgg_pred , axis=-1)
actual_label = test_data['label']

print(classification_report(actual_label, final_vgg_pred))
matrix=confusion_matrix(actual_label, final_vgg_pred)
sns.heatmap(matrix,square=True, annot=True, fmt='d', cbar=False,
            xticklabels=['0', '1'],
            yticklabels=['0', '1'])
plt.xlabel('Predicted label')
plt.ylabel('True label');
              precision    recall  f1-score   support

           0       0.77      0.89      0.83       234
           1       0.93      0.84      0.88       390

    accuracy                           0.86       624
   macro avg       0.85      0.87      0.86       624
weighted avg       0.87      0.86      0.86       624
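The report above can be cross-checked by hand from the confusion matrix. The counts below are reconstructed approximately from the support and recall values printed above (recall times support, rounded), so treat them as illustrative:

```python
# Reconstructed counts: rows = true class (0 = normal, 1 = pneumonia)
tn, fp = 208, 26   # 234 normal images,    recall 0.89 -> ~208 correct
fn, tp = 62, 328   # 390 pneumonia images, recall 0.84 -> ~328 correct

accuracy = (tn + tp) / (tn + fp + fn + tp)
precision_1 = tp / (tp + fp)   # pneumonia precision
recall_1 = tp / (tp + fn)      # pneumonia recall

print(round(accuracy, 2), round(precision_1, 2), round(recall_1, 2))
# 0.86 0.93 0.84
```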

In [39]:
print(vgg_history.history['val_accuracy'][-3])
print(vgg_history.history['val_loss'][-3])
0.8682692050933838
0.2951194941997528

5.2 - MobileNetV2

In [40]:
from tensorflow.keras.applications.mobilenet_v2 import MobileNetV2

base = MobileNetV2(weights = 'imagenet', include_top = False, input_shape = (224, 224, 3))
tf.keras.backend.clear_session()
    
for layer in base.layers:
    layer.trainable =  False

mobilenet_model = Sequential()
mobilenet_model.add(base)
mobilenet_model.add(GlobalAveragePooling2D())
mobilenet_model.add(BatchNormalization())
mobilenet_model.add(Dense(256, activation='relu'))
mobilenet_model.add(Dropout(0.5))
mobilenet_model.add(BatchNormalization())
mobilenet_model.add(Dense(128, activation='relu'))
mobilenet_model.add(Dropout(0.5))
mobilenet_model.add(Dense(2, activation='softmax'))

mobilenet_model.summary()
Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/mobilenet_v2/mobilenet_v2_weights_tf_dim_ordering_tf_kernels_1.0_224_no_top.h5
9412608/9406464 [==============================] - 0s 0us/step
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
mobilenetv2_1.00_224 (Functi (None, 7, 7, 1280)        2257984   
_________________________________________________________________
global_average_pooling2d (Gl (None, 1280)              0         
_________________________________________________________________
batch_normalization (BatchNo (None, 1280)              5120      
_________________________________________________________________
dense (Dense)                (None, 256)               327936    
_________________________________________________________________
dropout (Dropout)            (None, 256)               0         
_________________________________________________________________
batch_normalization_1 (Batch (None, 256)               1024      
_________________________________________________________________
dense_1 (Dense)              (None, 128)               32896     
_________________________________________________________________
dropout_1 (Dropout)          (None, 128)               0         
_________________________________________________________________
dense_2 (Dense)              (None, 2)                 258       
=================================================================
Total params: 2,625,218
Trainable params: 364,162
Non-trainable params: 2,261,056
_________________________________________________________________
In [44]:
train_gen, val_gen = gen()

optm = Adam(learning_rate=0.0001)
mobilenet_model.compile(loss='binary_crossentropy', optimizer=optm,
                        metrics=['accuracy'])

early_stopping = EarlyStopping(monitor='val_loss',
                               min_delta=.0001,
                               patience=3,
                               verbose=1,
                               mode='auto',
                               restore_best_weights=True)

model_save = ModelCheckpoint('./mobilenetV2.h5',
                             save_best_only=True,
                             save_weights_only=False,
                             monitor='val_loss',
                             mode='min', verbose=1)


mob_history = mobilenet_model.fit(train_gen,
                                  steps_per_epoch=train_gen.samples // BATCH_SIZE,
                                  epochs=EPOCHS,
                                  validation_data=val_gen,
                                  callbacks=[early_stopping, model_save])
Found 4192 validated image filenames.
Found 1040 validated image filenames.
Epoch 1/20
131/131 [==============================] - 98s 730ms/step - loss: 0.7897 - accuracy: 0.6004 - val_loss: 0.4541 - val_accuracy: 0.8500

Epoch 00001: val_loss improved from inf to 0.45408, saving model to ./mobilenetV2.h5
Epoch 2/20
131/131 [==============================] - 94s 720ms/step - loss: 0.4763 - accuracy: 0.8183 - val_loss: 0.3734 - val_accuracy: 0.8462

Epoch 00002: val_loss improved from 0.45408 to 0.37338, saving model to ./mobilenetV2.h5
Epoch 3/20
131/131 [==============================] - 94s 717ms/step - loss: 0.3754 - accuracy: 0.8612 - val_loss: 0.2983 - val_accuracy: 0.8760

Epoch 00003: val_loss improved from 0.37338 to 0.29825, saving model to ./mobilenetV2.h5
Epoch 4/20
131/131 [==============================] - 94s 719ms/step - loss: 0.3288 - accuracy: 0.8731 - val_loss: 0.2885 - val_accuracy: 0.8721

Epoch 00004: val_loss improved from 0.29825 to 0.28847, saving model to ./mobilenetV2.h5
Epoch 5/20
131/131 [==============================] - 94s 715ms/step - loss: 0.2855 - accuracy: 0.8919 - val_loss: 0.2511 - val_accuracy: 0.8971

Epoch 00005: val_loss improved from 0.28847 to 0.25111, saving model to ./mobilenetV2.h5
Epoch 6/20
131/131 [==============================] - 94s 718ms/step - loss: 0.2626 - accuracy: 0.8992 - val_loss: 0.2335 - val_accuracy: 0.9077

Epoch 00006: val_loss improved from 0.25111 to 0.23352, saving model to ./mobilenetV2.h5
Epoch 7/20
131/131 [==============================] - 94s 716ms/step - loss: 0.2499 - accuracy: 0.9072 - val_loss: 0.2273 - val_accuracy: 0.9058

Epoch 00007: val_loss improved from 0.23352 to 0.22725, saving model to ./mobilenetV2.h5
Epoch 8/20
131/131 [==============================] - 96s 730ms/step - loss: 0.2306 - accuracy: 0.9154 - val_loss: 0.2161 - val_accuracy: 0.9077

Epoch 00008: val_loss improved from 0.22725 to 0.21608, saving model to ./mobilenetV2.h5
Epoch 9/20
131/131 [==============================] - 93s 710ms/step - loss: 0.2201 - accuracy: 0.9248 - val_loss: 0.2136 - val_accuracy: 0.9096

Epoch 00009: val_loss improved from 0.21608 to 0.21362, saving model to ./mobilenetV2.h5
Epoch 10/20
131/131 [==============================] - 94s 716ms/step - loss: 0.2271 - accuracy: 0.9127 - val_loss: 0.1906 - val_accuracy: 0.9240

Epoch 00010: val_loss improved from 0.21362 to 0.19060, saving model to ./mobilenetV2.h5
Epoch 11/20
131/131 [==============================] - 94s 716ms/step - loss: 0.2344 - accuracy: 0.9072 - val_loss: 0.2223 - val_accuracy: 0.9077

Epoch 00011: val_loss did not improve from 0.19060
Epoch 12/20
131/131 [==============================] - 94s 715ms/step - loss: 0.2009 - accuracy: 0.9250 - val_loss: 0.2177 - val_accuracy: 0.9058

Epoch 00012: val_loss did not improve from 0.19060
Epoch 13/20
131/131 [==============================] - 94s 716ms/step - loss: 0.2195 - accuracy: 0.9127 - val_loss: 0.2025 - val_accuracy: 0.9173
Restoring model weights from the end of the best epoch.

Epoch 00013: val_loss did not improve from 0.19060
Epoch 00013: early stopping
In [45]:
plot(mob_history)
In [46]:
mob_pred =[]
for image in test_data.image_path:
    mob_pred.append(predict(image , mobilenet_model))
    
final_mob_pred  = np.argmax(mob_pred , axis=-1)
actual_label = test_data['label']

print(classification_report(actual_label, final_mob_pred))
matrix=confusion_matrix(actual_label, final_mob_pred)
sns.heatmap(matrix,square=True, annot=True, fmt='d', cbar=False,
            xticklabels=['0', '1'],
            yticklabels=['0', '1'])
plt.xlabel('Predicted label')
plt.ylabel('True label');
              precision    recall  f1-score   support

           0       0.88      0.85      0.86       234
           1       0.91      0.93      0.92       390

    accuracy                           0.90       624
   macro avg       0.89      0.89      0.89       624
weighted avg       0.90      0.90      0.90       624

In [47]:
print(mob_history.history['val_accuracy'][-4])
print(mob_history.history['val_loss'][-4])
0.9240384697914124
0.1905973255634308

5.3 - DenseNet169

In [48]:
from tensorflow.keras.applications.densenet import DenseNet169

base = DenseNet169(weights = 'imagenet', include_top = False, input_shape = (224, 224, 3))
tf.keras.backend.clear_session()

for layer in base.layers:
    layer.trainable =  False

densenet_model = Sequential()
densenet_model.add(base)
densenet_model.add(GlobalAveragePooling2D())
densenet_model.add(BatchNormalization())
densenet_model.add(Dense(256, activation='relu'))
densenet_model.add(Dropout(0.5))
densenet_model.add(BatchNormalization())
densenet_model.add(Dense(128, activation='relu'))
densenet_model.add(Dropout(0.5))
densenet_model.add(Dense(2, activation='softmax'))

densenet_model.summary()
Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/densenet/densenet169_weights_tf_dim_ordering_tf_kernels_notop.h5
51879936/51877672 [==============================] - 0s 0us/step
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
densenet169 (Functional)     (None, 7, 7, 1664)        12642880  
_________________________________________________________________
global_average_pooling2d (Gl (None, 1664)              0         
_________________________________________________________________
batch_normalization (BatchNo (None, 1664)              6656      
_________________________________________________________________
dense (Dense)                (None, 256)               426240    
_________________________________________________________________
dropout (Dropout)            (None, 256)               0         
_________________________________________________________________
batch_normalization_1 (Batch (None, 256)               1024      
_________________________________________________________________
dense_1 (Dense)              (None, 128)               32896     
_________________________________________________________________
dropout_1 (Dropout)          (None, 128)               0         
_________________________________________________________________
dense_2 (Dense)              (None, 2)                 258       
=================================================================
Total params: 13,109,954
Trainable params: 463,234
Non-trainable params: 12,646,720
_________________________________________________________________
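As the summary shows, `GlobalAveragePooling2D` collapses the backbone's `(7, 7, 1664)` feature map into a single 1664-dimensional vector by averaging over the two spatial axes. A minimal NumPy sketch of that operation (random data stands in for a real DenseNet169 feature map):

```python
import numpy as np

# One image's hypothetical backbone output: (batch, height, width, channels)
features = np.random.rand(1, 7, 7, 1664)

# Global average pooling = mean over the spatial axes (height and width)
pooled = features.mean(axis=(1, 2))
print(pooled.shape)  # (1, 1664)
```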
In [52]:
train_gen, val_gen = gen()

optm = Adam(learning_rate=0.0001)
densenet_model.compile(loss='binary_crossentropy', optimizer=optm,
                       metrics=['accuracy'])

# lowercase name so the EarlyStopping class itself is not shadowed
early_stopping = EarlyStopping(monitor='val_loss',
                               min_delta=0.0001,
                               patience=3,
                               verbose=1,
                               mode='auto',
                               restore_best_weights=True)

model_save = ModelCheckpoint('./densenet169.h5',
                             save_best_only=True,
                             save_weights_only=False,
                             monitor='val_loss',
                             mode='min', verbose=1)

dense_history = densenet_model.fit(train_gen,
                                   steps_per_epoch=train_gen.samples // BATCH_SIZE,
                                   epochs=EPOCHS,
                                   validation_data=val_gen,
                                   callbacks=[early_stopping, model_save])
Found 4192 validated image filenames.
Found 1040 validated image filenames.
Epoch 1/20
131/131 [==============================] - 109s 768ms/step - loss: 0.8379 - accuracy: 0.5811 - val_loss: 0.4822 - val_accuracy: 0.8596

Epoch 00001: val_loss improved from inf to 0.48221, saving model to ./densenet169.h5
Epoch 2/20
131/131 [==============================] - 99s 755ms/step - loss: 0.4852 - accuracy: 0.8235 - val_loss: 0.2789 - val_accuracy: 0.9337

Epoch 00002: val_loss improved from 0.48221 to 0.27895, saving model to ./densenet169.h5
Epoch 3/20
131/131 [==============================] - 98s 743ms/step - loss: 0.3651 - accuracy: 0.8727 - val_loss: 0.2052 - val_accuracy: 0.9452

Epoch 00003: val_loss improved from 0.27895 to 0.20515, saving model to ./densenet169.h5
Epoch 4/20
131/131 [==============================] - 97s 744ms/step - loss: 0.3211 - accuracy: 0.8859 - val_loss: 0.1630 - val_accuracy: 0.9529

Epoch 00004: val_loss improved from 0.20515 to 0.16301, saving model to ./densenet169.h5
Epoch 5/20
131/131 [==============================] - 97s 741ms/step - loss: 0.2703 - accuracy: 0.9098 - val_loss: 0.1440 - val_accuracy: 0.9567

Epoch 00005: val_loss improved from 0.16301 to 0.14397, saving model to ./densenet169.h5
Epoch 6/20
131/131 [==============================] - 96s 737ms/step - loss: 0.2594 - accuracy: 0.9120 - val_loss: 0.1354 - val_accuracy: 0.9577

Epoch 00006: val_loss improved from 0.14397 to 0.13536, saving model to ./densenet169.h5
Epoch 7/20
131/131 [==============================] - 97s 742ms/step - loss: 0.2400 - accuracy: 0.9163 - val_loss: 0.1398 - val_accuracy: 0.9548

Epoch 00007: val_loss did not improve from 0.13536
Epoch 8/20
131/131 [==============================] - 97s 737ms/step - loss: 0.2261 - accuracy: 0.9214 - val_loss: 0.1261 - val_accuracy: 0.9615

Epoch 00008: val_loss improved from 0.13536 to 0.12606, saving model to ./densenet169.h5
Epoch 9/20
131/131 [==============================] - 97s 738ms/step - loss: 0.2188 - accuracy: 0.9200 - val_loss: 0.1352 - val_accuracy: 0.9548

Epoch 00009: val_loss did not improve from 0.12606
Epoch 10/20
131/131 [==============================] - 97s 739ms/step - loss: 0.1961 - accuracy: 0.9271 - val_loss: 0.1310 - val_accuracy: 0.9587

Epoch 00010: val_loss did not improve from 0.12606
Epoch 11/20
131/131 [==============================] - 96s 735ms/step - loss: 0.2053 - accuracy: 0.9205 - val_loss: 0.1284 - val_accuracy: 0.9577
Restoring model weights from the end of the best epoch.

Epoch 00011: val_loss did not improve from 0.12606
Epoch 00011: early stopping
In [53]:
plot(dense_history)
In [54]:
dense_pred = []
for image in test_data.image_path:
    dense_pred.append(predict(image, densenet_model))

final_dense_pred = np.argmax(dense_pred, axis=-1)
actual_label = test_data['label']

print(classification_report(actual_label, final_dense_pred))
matrix = confusion_matrix(actual_label, final_dense_pred)
sns.heatmap(matrix, square=True, annot=True, fmt='d', cbar=False,
            xticklabels=['0', '1'],
            yticklabels=['0', '1'])
plt.xlabel('Predicted label')
plt.ylabel('True label');
              precision    recall  f1-score   support

           0       0.91      0.82      0.86       234
           1       0.90      0.95      0.92       390

    accuracy                           0.90       624
   macro avg       0.90      0.88      0.89       624
weighted avg       0.90      0.90      0.90       624
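As a sanity check on the report above: the macro average weights both classes equally, while the weighted average weights each class's f1-score by its support. A quick plain-Python sketch using the DenseNet169 f1-scores:

```python
# f1-scores and supports copied from the classification report above
f1 = {0: 0.86, 1: 0.92}
support = {0: 234, 1: 390}

macro = sum(f1.values()) / len(f1)                                   # 0.89
weighted = sum(f1[c] * support[c] for c in f1) / sum(support.values())
print(macro, weighted)  # weighted ≈ 0.8975, reported as 0.90
```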

In [55]:
print(dense_history.history['val_accuracy'][-4])
print(dense_history.history['val_loss'][-4])
0.9615384340286255
0.12605918943881989
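Indexing the history with `[-4]` works here because `EarlyStopping` uses `patience=3`: training stops three epochs after the last improvement, so the best epoch's metrics sit four entries from the end of each history list. A minimal plain-Python sketch, using the `val_loss` values from the DenseNet169 log above:

```python
# val_loss per epoch, copied from the DenseNet169 training log above
val_loss = [0.4822, 0.2789, 0.2052, 0.1630, 0.1440, 0.1354, 0.1398,
            0.1261, 0.1352, 0.1310, 0.1284]

best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__)
print(best_epoch + 1)                        # epoch 8 had the lowest val_loss
print(val_loss[best_epoch] == val_loss[-4])  # the same entry as index -4
```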

5.4 - InceptionV3

In [68]:
from tensorflow.keras.applications import InceptionV3

tf.keras.backend.clear_session()  # reset layer-name counters before building
base = InceptionV3(weights='imagenet', include_top=False, input_shape=(224, 224, 3))

# Freeze the ImageNet-pretrained backbone; only the new head is trained
for layer in base.layers:
    layer.trainable = False

incept_model = Sequential()
incept_model.add(base)
incept_model.add(GlobalAveragePooling2D())
incept_model.add(BatchNormalization())
incept_model.add(Dense(256, activation='relu'))
incept_model.add(Dropout(0.5))
incept_model.add(BatchNormalization())
incept_model.add(Dense(128, activation='relu'))
incept_model.add(Dropout(0.5))
incept_model.add(Dense(2, activation='softmax'))

incept_model.summary()
Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/inception_v3/inception_v3_weights_tf_dim_ordering_tf_kernels_notop.h5
87916544/87910968 [==============================] - 1s 0us/step
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
inception_v3 (Functional)    (None, 5, 5, 2048)        21802784  
_________________________________________________________________
global_average_pooling2d (Gl (None, 2048)              0         
_________________________________________________________________
batch_normalization (BatchNo (None, 2048)              8192      
_________________________________________________________________
dense (Dense)                (None, 256)               524544    
_________________________________________________________________
dropout (Dropout)            (None, 256)               0         
_________________________________________________________________
batch_normalization_1 (Batch (None, 256)               1024      
_________________________________________________________________
dense_1 (Dense)              (None, 128)               32896     
_________________________________________________________________
dropout_1 (Dropout)          (None, 128)               0         
_________________________________________________________________
dense_2 (Dense)              (None, 2)                 258       
=================================================================
Total params: 22,369,698
Trainable params: 562,306
Non-trainable params: 21,807,392
_________________________________________________________________
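A quick arithmetic check (plain Python) of the counts in the summary above: the frozen InceptionV3 backbone plus the moving mean/variance of the two BatchNormalization layers account for all non-trainable parameters, while the Dense layers and the BatchNormalization scale/offset pairs make up the trainable ones.

```python
backbone = 21_802_784                   # frozen InceptionV3 parameters

# BatchNorm moving mean + variance are never trained
non_trainable = backbone + 2 * 2048 + 2 * 256
print(non_trainable)                    # 21,807,392 — matches the summary

trainable = (2 * 2048 + 2 * 256         # BatchNorm gamma/beta pairs
             + 2048 * 256 + 256         # Dense(256) weights + biases
             + 256 * 128 + 128          # Dense(128) weights + biases
             + 128 * 2 + 2)             # Dense(2) weights + biases
print(trainable)                        # 562,306 — matches the summary
```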
In [72]:
train_gen, val_gen = gen()

optm = Adam(learning_rate=0.0001)
incept_model.compile(loss='binary_crossentropy', optimizer=optm,
                     metrics=['accuracy'])

# lowercase name so the EarlyStopping class itself is not shadowed
early_stopping = EarlyStopping(monitor='val_loss',
                               min_delta=0.0001,
                               patience=3,
                               verbose=1,
                               mode='auto',
                               restore_best_weights=True)

model_save = ModelCheckpoint('./inceptionV3.h5',
                             save_best_only=True,
                             save_weights_only=False,
                             monitor='val_loss',
                             mode='min', verbose=1)

incept_history = incept_model.fit(train_gen,
                                  steps_per_epoch=train_gen.samples // BATCH_SIZE,
                                  epochs=EPOCHS,
                                  validation_data=val_gen,
                                  callbacks=[early_stopping, model_save])
Found 4192 validated image filenames.
Found 1040 validated image filenames.
Epoch 1/20
131/131 [==============================] - 102s 741ms/step - loss: 0.8389 - accuracy: 0.5565 - val_loss: 0.4376 - val_accuracy: 0.8510

Epoch 00001: val_loss improved from inf to 0.43757, saving model to ./inceptionV3.h5
Epoch 2/20
131/131 [==============================] - 96s 730ms/step - loss: 0.4806 - accuracy: 0.8190 - val_loss: 0.3199 - val_accuracy: 0.8904

Epoch 00002: val_loss improved from 0.43757 to 0.31991, saving model to ./inceptionV3.h5
Epoch 3/20
131/131 [==============================] - 95s 729ms/step - loss: 0.3748 - accuracy: 0.8638 - val_loss: 0.2753 - val_accuracy: 0.9010

Epoch 00003: val_loss improved from 0.31991 to 0.27534, saving model to ./inceptionV3.h5
Epoch 4/20
131/131 [==============================] - 95s 728ms/step - loss: 0.3201 - accuracy: 0.8842 - val_loss: 0.2437 - val_accuracy: 0.9048

Epoch 00004: val_loss improved from 0.27534 to 0.24369, saving model to ./inceptionV3.h5
Epoch 5/20
131/131 [==============================] - 95s 726ms/step - loss: 0.3072 - accuracy: 0.8820 - val_loss: 0.2194 - val_accuracy: 0.9240

Epoch 00005: val_loss improved from 0.24369 to 0.21937, saving model to ./inceptionV3.h5
Epoch 6/20
131/131 [==============================] - 97s 741ms/step - loss: 0.2951 - accuracy: 0.8898 - val_loss: 0.2161 - val_accuracy: 0.9279

Epoch 00006: val_loss improved from 0.21937 to 0.21605, saving model to ./inceptionV3.h5
Epoch 7/20
131/131 [==============================] - 96s 730ms/step - loss: 0.2682 - accuracy: 0.8935 - val_loss: 0.2126 - val_accuracy: 0.9240

Epoch 00007: val_loss improved from 0.21605 to 0.21263, saving model to ./inceptionV3.h5
Epoch 8/20
131/131 [==============================] - 95s 723ms/step - loss: 0.2459 - accuracy: 0.9160 - val_loss: 0.2071 - val_accuracy: 0.9240

Epoch 00008: val_loss improved from 0.21263 to 0.20712, saving model to ./inceptionV3.h5
Epoch 9/20
131/131 [==============================] - 96s 729ms/step - loss: 0.2449 - accuracy: 0.9149 - val_loss: 0.1880 - val_accuracy: 0.9327

Epoch 00009: val_loss improved from 0.20712 to 0.18799, saving model to ./inceptionV3.h5
Epoch 10/20
131/131 [==============================] - 96s 733ms/step - loss: 0.2139 - accuracy: 0.9227 - val_loss: 0.1776 - val_accuracy: 0.9413

Epoch 00010: val_loss improved from 0.18799 to 0.17756, saving model to ./inceptionV3.h5
Epoch 11/20
131/131 [==============================] - 99s 757ms/step - loss: 0.2120 - accuracy: 0.9260 - val_loss: 0.1844 - val_accuracy: 0.9346

Epoch 00011: val_loss did not improve from 0.17756
Epoch 12/20
131/131 [==============================] - 95s 727ms/step - loss: 0.2379 - accuracy: 0.9061 - val_loss: 0.1803 - val_accuracy: 0.9394

Epoch 00012: val_loss did not improve from 0.17756
Epoch 13/20
131/131 [==============================] - 95s 723ms/step - loss: 0.2258 - accuracy: 0.9154 - val_loss: 0.1758 - val_accuracy: 0.9413

Epoch 00013: val_loss improved from 0.17756 to 0.17578, saving model to ./inceptionV3.h5
Epoch 14/20
131/131 [==============================] - 96s 732ms/step - loss: 0.2083 - accuracy: 0.9258 - val_loss: 0.1823 - val_accuracy: 0.9298

Epoch 00014: val_loss did not improve from 0.17578
Epoch 15/20
131/131 [==============================] - 95s 726ms/step - loss: 0.2107 - accuracy: 0.9238 - val_loss: 0.1937 - val_accuracy: 0.9250

Epoch 00015: val_loss did not improve from 0.17578
Epoch 16/20
131/131 [==============================] - 96s 730ms/step - loss: 0.1975 - accuracy: 0.9246 - val_loss: 0.1863 - val_accuracy: 0.9308
Restoring model weights from the end of the best epoch.

Epoch 00016: val_loss did not improve from 0.17578
Epoch 00016: early stopping
In [73]:
plot(incept_history)
In [74]:
incept_pred = []
for image in test_data.image_path:
    incept_pred.append(predict(image, incept_model))

final_incept_pred = np.argmax(incept_pred, axis=-1)
actual_label = test_data['label']

print(classification_report(actual_label, final_incept_pred))
matrix = confusion_matrix(actual_label, final_incept_pred)
sns.heatmap(matrix, square=True, annot=True, fmt='d', cbar=False,
            xticklabels=['0', '1'],
            yticklabels=['0', '1'])
plt.xlabel('Predicted label')
plt.ylabel('True label');
              precision    recall  f1-score   support

           0       0.90      0.64      0.75       234
           1       0.81      0.96      0.88       390

    accuracy                           0.84       624
   macro avg       0.86      0.80      0.81       624
weighted avg       0.85      0.84      0.83       624

6 - Final conclusion

CNN and Transfer Learning models such as VGG-16, MobileNetV2, DenseNet169, and InceptionV3 have been used in this experiment

Models With Their Accuracy of Prediction

Capture.PNG
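A minimal plain-Python sketch of tabulating such a comparison; only the two test accuracies from the classification reports shown in this section (DenseNet169 and InceptionV3) are filled in, so the remaining models' entries would come from their earlier sections.

```python
# (model name, test accuracy) pairs from the classification reports above
results = [('DenseNet169', 0.90), ('InceptionV3', 0.84)]

# Rank models from best to worst test accuracy
for name, acc in sorted(results, key=lambda r: r[1], reverse=True):
    print(f'{name:<12} {acc:.2f}')
```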